---
author: Marcus Rohrmoser
categories:
- en
- development
date: "2011-08-05T06:48:19+00:00"
flattrss_autosubmited:
- "true"
tags:
- iOS
- iPhone
- like
- Search
- SQL
- SQLite
title: Rank sql(ite) text search results
type: post
url: /2011/08/rank-sqlite-text-search-results/
yourls_shorturl:
- http://mro.name/4a
yourls_tweeted:
- 1
---
When searching for text snippets in sql databases you might want to rank the results according to
&#8222;how good did it match&#8220;. And: the ranking shouldn't make the query slower.

Let's take a simple example using the [LIKE operator][1]. (I know, [FTS][2] does a better job, but
let's stick to like for now).

Assume the search expression 'a bc de' and a table 'my_table' with text columns 'title' and
'description'.

We want to find all rows with 'title' matching all three blank-separated parts of the search term:

<!--more-->

<pre class="line-numbers"><code class="language-sql">
SELECT rowid, title
FROM my_table
WHERE (title LIKE '%a%' AND title LIKE '%bc%' AND title LIKE '%de%')
</code></pre>

**To sort them, we apply a bonus for parts matching the column start**:

<pre class="line-numbers"><code class="language-sql">
SELECT rowid, title,
    -- column start bonus
    LIKE('a%', title) +
    LIKE('bc%', title) +
    LIKE('de%', title) +
0 AS bonus
FROM my_table
WHERE ((title LIKE '%a%') AND (title LIKE '%bc%') AND (title LIKE '%de%'))
ORDER BY bonus DESC, title ASC, rowid ASC
</code></pre>

Next, we'd like to add a (somewhat smaller) **bonus for word-starts**:

<pre class="line-numbers"><code class="language-sql">
SELECT rowid, title,
    -- column start bonus
    LIKE('a%', title) * 2 +
    LIKE('bc%', title) * 2 +
    LIKE('de%', title) * 2 +
    -- word start bonus
    LIKE('% a%', title) * 1 +
    LIKE('% bc%', title) * 1 +
    LIKE('% de%', title) * 1 +
0 AS bonus
FROM my_table
WHERE ((title LIKE '%a%') AND (title LIKE '%bc%') AND (title LIKE '%de%'))
ORDER BY bonus DESC, title ASC, rowid ASC
</code></pre>

Rows **matching the three terms in order** get an even bigger bonus:

<pre class="line-numbers"><code class="language-sql">
SELECT rowid, title,
    -- correct order bonus
    LIKE('%a%bc%de%', title) * 5 * 3 +
    -- column start bonus
    LIKE('a%', title) * 2 +
    LIKE('bc%', title) * 2 +
    LIKE('de%', title) * 2 +
    -- word start bonus
    LIKE('% a%', title) * 1 +
    LIKE('% bc%', title) * 1 +
    LIKE('% de%', title) * 1 +
0 AS bonus
FROM my_table
WHERE ((title LIKE '%a%') AND (title LIKE '%bc%') AND (title LIKE '%de%'))
ORDER BY bonus DESC, title ASC, rowid ASC
</code></pre>

And finally adding the **match on 'description' secondary**:

<pre class="line-numbers"><code class="language-sql">
SELECT rowid, title, description,
    -- title is primary match:
    -- correct order bonus
    LIKE('%a%bc%de%', title) * 50 * 3 +
    -- column start bonus
    LIKE('a%', title) * 20 +
    LIKE('bc%', title) * 20 +
    LIKE('de%', title) * 20 +
    -- word start bonus
    LIKE('% a%', title) * 10 +
    LIKE('% bc%', title) * 10 +
    LIKE('% de%', title) * 10 +
    -- description is secondary match:
    -- correct order bonus
    LIKE('%a%bc%de%', description) * 5 * 3 +
    -- column start bonus
    LIKE('a%', description) * 2 +
    LIKE('bc%', description) * 2 +
    LIKE('de%', description) * 2 +
    -- word start bonus
    LIKE('% a%', description) * 1 +
    LIKE('% bc%', description) * 1 +
    LIKE('% de%', description) * 1 +
0 AS bonus
FROM my_table
WHERE ((title       LIKE '%a%') AND (title       LIKE '%bc%') AND (title       LIKE '%de%'))
OR    ((description LIKE '%a%') AND (description LIKE '%bc%') AND (description LIKE '%de%'))
ORDER BY bonus DESC, title ASC, description ASC, rowid ASC
</code></pre>

You get the idea.

Funny thing is – the whole ranking logic doesn't hit performance (at least for small texts in the two columns)!

So, key is:

1. scan the table only once to find match candidates using the LIKE operator,
2. use the LIKE function(!) plus weighting-factors to compute a bonus for each hit,
3. evtl. add secondary matching columns.

P.S.: This post was inspired by a chat with [Deesa][3] on the way home riding [False Creek Ferry][4].

 [1]: http://www.sqlite.org/lang_expr.html#like
 [2]: http://blog.mro.name/2010/03/iphone-sqlite-fulltext-index/
 [3]: http://leetr.com/
 [4]: http://www.granvilleislandferries.bc.ca/
